Paired-end small RNA sequencing reveals a possible overestimation in the isomiR sequence repertoire previously reported from conventional single read data analysis

Abstract

Background: Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomiR variation reported so far could be due to systematic technical noise and not real.

Results: We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomiR level. We exploited its ability to use single or merged reads to compare isomiR results derived from paired-end (PE) reads with those from single reads (SR), to address whether detectable sequence differences relative to canonical miRNAs found in isomiRs are true biological variations or the result of errors in sequencing. We have detected non-negligible systematic differences between SR and PE data, which primarily affect putative internally edited isomiRs and, at a much smaller frequency, terminal length-changing isomiRs. This is relevant for the identification of true isomiRs in small RNA sequencing datasets.

Conclusions: We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of the abundance and diversity of miRNA isoforms. Efforts in annotating the isomiRnome should take this into account.
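The reasoning behind the SR versus PE comparison can be illustrated with a minimal Python sketch. It is not the XICRA implementation: the function names, the toy canonical sequence, and the strict "both mates must agree at every base" merge rule are assumptions made for illustration. The idea is that a short miRNA insert is read in full by both mates, so a base call that appears in one mate only is more likely a sequencing error than an internal edit.

```python
# Illustrative sketch only; names and the strict merge rule are assumptions,
# not the XICRA pipeline's actual logic.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA-alphabet read."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_fully_overlapping(r1: str, r2: str):
    """Return the merged read if R1 and revcomp(R2) agree at every base.

    Assumes adapters are already trimmed, so both mates span the whole insert.
    Returns None when the mates disagree or differ in length.
    """
    r2_rc = reverse_complement(r2)
    if len(r1) != len(r2_rc):
        return None
    return r1 if r1 == r2_rc else None


def internal_mismatches(read: str, canonical: str):
    """0-based positions where a same-length read differs from the canonical sequence."""
    if len(read) != len(canonical):
        return []
    return [i for i, (a, b) in enumerate(zip(read, canonical)) if a != b]


# Toy canonical miRNA sequence (DNA alphabet), used only as an example.
canonical = "TGAGGTAGTAGGTTGTATAGTT"

# Simulate a sequencing error in R1 only; R2 reads the insert correctly.
r1_with_error = canonical[:10] + "A" + canonical[11:]
r2 = reverse_complement(canonical)

# Single-read view: the error looks like an internally edited isomiR.
sr_call = internal_mismatches(r1_with_error, canonical)

# Paired-end view: the mates disagree, so the read pair is not merged.
pe_call = merge_fully_overlapping(r1_with_error, r2)
```

In this toy case the single-read analysis reports a mismatch at one internal position, while the paired-end merge rejects the pair because the mates disagree. Real read mergers typically resolve disagreements using base qualities instead of discarding the pair outright; the strict rule here only keeps the example short.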

Related resources

Quantifying Alternative Splicing from Paired-end RNA-sequencing Data.

RNA-sequencing has revolutionized biomedical research and, in particular, our ability to study gene alternative splicing. The problem has important implications for human health, as alternative splicing may be involved in malfunctions at the cellular level and multiple diseases. However, the high-dimensional nature of the data and the existence of experimental biases pose serious data analysis ...

Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA

Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity, a large dynamic range and allows measurement of transcript copy numbers in a sample. Illumina's genome analy...

A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data

Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...

End Sequence Analysis Toolkit (ESAT) expands the extractable information from single-cell RNA-seq data.

RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolkit (ESAT). As a test, we first compared end-sequencing and bulk RNA-seq using R...

Journal

Journal title: BMC Bioinformatics

Year: 2021

ISSN: 1471-2105

DOI: https://doi.org/10.1186/s12859-021-04128-1